Pennsylvania State University Researchers Leverage CIROH Cyberinfrastructure for Advanced Hydrological Modeling
Pennsylvania State University (PSU) researchers are using CIROH Cyberinfrastructure to tackle complex hydrological modeling challenges. This post highlights their approach, which pairs the Wukong computing platform with Amazon S3 bucket storage to process and analyze large-scale environmental datasets efficiently.
The Computing and Storage Infrastructure
Wukong Computing Platform
The PSU team has been using Wukong, a high-performance computing (HPC) cluster designed for data-intensive scientific applications such as high-resolution, physics-informed machine learning for national water modeling (Song et al. 2024). Wukong provides the computational power needed to run complex simulations and process large environmental datasets that traditional computing resources would struggle with.
Key advantages of Wukong include:
- Large GPU capacity for high-resolution ML/differentiable process-based models
- Scalable parallel processing capabilities
- Optimized performance for data-intensive workloads
- Reduced processing time for big data
- Support for multi-node computation to handle larger geographical areas
S3 Bucket Integration
To complement Wukong's computational power, the PSU researchers and AWI DevOps staff implemented Amazon S3 (Simple Storage Service) buckets as their secondary data storage solution. This integration offers several benefits:
- Virtually unlimited storage capacity for growing datasets
- Data durability and redundancy
- Cost-effective long-term storage, with S3 Intelligent-Tiering automatically lowering storage costs by moving data between access tiers when access patterns change
- Seamless data transfer between computing nodes
- Version control for dataset iterations
- Easy data sharing with users not on Wukong
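As a rough illustration of the Intelligent-Tiering item above, the sketch below builds an S3 lifecycle rule that moves objects under a prefix into the Intelligent-Tiering storage class. It is not the team's actual configuration: the bucket name, prefix, and rule ID are hypothetical, and the `boto3` call assumes the library is installed and AWS credentials are already configured.

```python
def build_intelligent_tiering_lifecycle(prefix: str,
                                        rule_id: str = "move-to-intelligent-tiering") -> dict:
    """Build a lifecycle configuration that transitions objects under
    `prefix` to the S3 Intelligent-Tiering storage class.

    The rule ID and prefix are illustrative placeholders.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                # Only objects whose key starts with this prefix are affected.
                "Filter": {"Prefix": prefix},
                # Days=0 requests the transition as soon as the rule is evaluated.
                "Transitions": [
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}
                ],
            }
        ]
    }


def apply_lifecycle(bucket: str, prefix: str) -> None:
    """Attach the rule to a bucket (requires boto3 and valid AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_intelligent_tiering_lifecycle(prefix),
    )


if __name__ == "__main__":
    # Hypothetical prefix; printing the rule does not contact AWS.
    print(build_intelligent_tiering_lifecycle("forcing-data/"))
```

Calling `apply_lifecycle("example-bucket", "forcing-data/")` would replace any existing lifecycle configuration on that bucket, so in practice the new rule would likely need to be merged with the rules already in place.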
Research Applications
The PSU team has applied this powerful computing infrastructure to several critical research areas:
- National Streamflow Modeling: Training differentiable hydrologic models with high-resolution forcing and static attribute data across extensive geographical regions using observations from thousands of gauges, followed by whole-domain forwarding.
- National River Routing: Conducting river routing on MERIT/HydroFabric river networks, combined with neural network-supported routing parameter learning.
- NextGen Candidate Models & Data Assimilation: Applying multiple NextGen candidate models and data assimilation algorithms within the differentiable modeling framework, which supports compliance with BMI.
- Foundation Model Development: Developing a foundation model to explore co-evolution between landscapes.
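To make the idea of "routing parameters" in the river routing item above more concrete, here is a minimal sketch of classic Muskingum routing for a single reach. The post does not say which routing scheme the team uses, so this choice is an assumption made purely for illustration. The parameters `k` (storage time constant) and `x` (weighting factor) stand in for the kind of per-reach values a neural network might be trained to supply.

```python
def muskingum_coefficients(k: float, x: float, dt: float) -> tuple:
    """Return the Muskingum weights (c0, c1, c2) for one reach.

    k  : storage time constant (same time unit as dt)
    x  : dimensionless weighting factor, typically between 0 and 0.5
    dt : routing time step
    """
    denom = 2.0 * k * (1.0 - x) + dt
    c0 = (dt - 2.0 * k * x) / denom
    c1 = (dt + 2.0 * k * x) / denom
    c2 = (2.0 * k * (1.0 - x) - dt) / denom
    return c0, c1, c2


def route_reach(inflow: list, k: float, x: float, dt: float) -> list:
    """Route an inflow hydrograph through one reach.

    The initial outflow is assumed equal to the initial inflow
    (a steady-state start), which is a common simplification.
    """
    if not inflow:
        return []
    c0, c1, c2 = muskingum_coefficients(k, x, dt)
    outflow = [inflow[0]]
    for t in range(1, len(inflow)):
        # Outflow depends on current inflow, previous inflow, previous outflow.
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[-1])
    return outflow


if __name__ == "__main__":
    # Hypothetical flood pulse (m^3/s) with illustrative parameter values.
    pulse = [10.0, 10.0, 50.0, 80.0, 40.0, 20.0, 10.0, 10.0]
    print(route_reach(pulse, k=2.0, x=0.2, dt=1.0))
```

Because the three weights sum to one, a constant inflow passes through unchanged, while a flood pulse comes out delayed and attenuated. In a differentiable setting the same update would be written with tensor operations so that gradients can flow back to whatever model predicts `k` and `x`.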
Thank you to all those who contributed towards this effort.
Learn More
For more details on the Wukong computing platform, check out the official documentation:
Wukong Documentation
For the full research paper by Song et al. (2024), visit:
DOI: 10.22541/essoar.172736277.74497104/v1